fix(lookup_error): weak fuzzy hits no longer suppress semantic fallback by critesjosh · Pull Request #20 · AztecProtocol/mcp-server

critesjosh · 2026-05-02T01:01:10Z

Summary

Reported in the v1.20.0 dogfood test: aztec_lookup_error("note already nullified") returned the unrelated catalog entry "Contract already initialized" with a Jaccard word-overlap score of 54, and the semantic-documentation fallback never fired. The early-return in lookupAztecError treated any totalMatches > 0 as authoritative regardless of confidence, so a noise-floor fuzzy match shadowed the actually-useful answer.
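To see why this hit sits at the noise floor, here is a minimal sketch of plain Jaccard similarity over word sets. The `tokenize` and `jaccard` helpers are hypothetical and not taken from the repository, and how the catalog scales raw overlap into its 50-65 score band is not stated in this PR, so the sketch only shows the raw ratio.

```typescript
// Hypothetical helpers for illustration; not the project's implementation.
// Splits on non-word characters and lowercases, so "Contract" matches "contract".
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((w) => w.length > 0));
}

// Jaccard similarity: |intersection| / |union| of the two word sets.
function jaccard(a: string, b: string): number {
  const left = tokenize(a);
  const right = tokenize(b);
  const shared = Array.from(left).filter((w) => right.has(w)).length;
  const unionSize = left.size + right.size - shared;
  return unionSize === 0 ? 0 : shared / unionSize;
}

const overlap = jaccard("note already nullified", "Contract already initialized");
console.log(overlap); // 0.2: one shared word ("already") out of five distinct words
```

The two messages share only the word "already", which is the kind of overlap the old early-return treated as authoritative.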

Fix

Introduce STRONG_MATCH_THRESHOLD = 70. Catalog hits below the threshold no longer short-circuit semantic fallback — they're kept in the result so the formatter can still render them as low-confidence cues, but the tool now falls through to the semantic path.

The threshold aligns with the existing score system in src/utils/error-lookup.ts:

| Score | Match type | Effect |
| --- | --- | --- |
| 100 | `exact-code` / `hex-signature` | strong → short-circuits |
| 95 | `exact-pattern` | strong → short-circuits |
| 70-80 | `substring` | strong → short-circuits (boundary at 70) |
| 50-65 | `word-overlap` (Jaccard) | weak → falls through to semantic |

A codeMatch (ripgrep over cloned source) by itself still short-circuits — those are direct grep hits with no fuzziness.

When weak hints exist alongside semantic results, the message field names the situation explicitly ("No strong static match — N low-confidence fuzzy hint(s) shown below. Showing relevant documentation.") rather than pretending nothing matched.
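The short-circuit rule described in this section can be sketched as follows. The `CatalogMatch` and `StaticLookup` shapes and the `shouldShortCircuit` helper are assumptions made for illustration; only `STRONG_MATCH_THRESHOLD = 70`, the score bands, and the codeMatch behavior come from the PR text, and the actual code in src/utils/error-lookup.ts may be structured differently.

```typescript
// Assumed shapes for illustration; the real types may differ.
interface CatalogMatch {
  name: string;
  score: number; // 100 exact-code, 95 exact-pattern, 70-80 substring, 50-65 word-overlap
}

interface StaticLookup {
  catalogMatches: CatalogMatch[];
  codeMatch: string | null; // a ripgrep hit over cloned source, or null when there is none
}

const STRONG_MATCH_THRESHOLD = 70;

// True when the static result is strong enough to skip the semantic fallback.
function shouldShortCircuit(result: StaticLookup): boolean {
  const hasStrongCatalogHit = result.catalogMatches.some(
    (match) => match.score >= STRONG_MATCH_THRESHOLD
  );
  // A direct grep hit short-circuits on its own, with or without catalog matches.
  return hasStrongCatalogHit || result.codeMatch !== null;
}

const weakOnly: StaticLookup = {
  catalogMatches: [{ name: "Contract already initialized", score: 54 }],
  codeMatch: null,
};
console.log(shouldShortCircuit(weakOnly)); // false: falls through to semantic search
```

The comparison is `>=` so that a score of exactly 70 counts as strong, matching the boundary test listed below.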

Tests added (6)

- threshold boundary: score === 70 still short-circuits
- codeMatch alone (no catalog) still short-circuits
- regression: "note already nullified" with score-54 word-overlap → semanticHealth: "ok", semanticResults populated, weak hint preserved in result.catalogMatches, message contains "low-confidence"
- multiple weak hints (max score 65) still fall through
- weak hints + no client → "skipped", message names the situation and points at API_KEY
- weak hints + semantic returning empty → "no_results", message acknowledges both signals

tsc --noEmit clean. All 247 tests pass (was 241).

Test plan

- npm run build (tsc)
- npx vitest run
- After release: re-run aztec_lookup_error("note already nullified") against v1.21+ and confirm semantic fallback returns documentation about note lifecycle / nullification.

🤖 Generated with Claude Code

Reported in v1.20.0 dogfood: `aztec_lookup_error("note already nullified")` returned the unrelated catalog entry "Contract already initialized" with a Jaccard word-overlap score of 54, and the semantic-documentation fallback never fired because the early-return treated any catalog hit, regardless of confidence, as authoritative.

Fix: introduce STRONG_MATCH_THRESHOLD = 70. Below that, the catalog hits are kept in the result (so the formatter can still render them as low-confidence cues), but the tool falls through to the semantic fallback.

The threshold aligns with the score system in utils/error-lookup.ts:

- 100 exact-code / hex-signature → strong, always short-circuits
- 95 exact-pattern → strong
- 70-80 substring → strong (boundary at 70)
- 50-65 word-overlap (Jaccard) → weak, falls through

When weak hints exist alongside the semantic results, the message field now names the situation ("No strong static match — N low-confidence fuzzy hint(s) shown below. Showing relevant documentation.") instead of pretending nothing matched.

Tests added (6):

- threshold boundary: score === 70 still short-circuits
- codeMatch alone (no catalog) still short-circuits
- regression: "note already nullified" with score-54 word-overlap → semanticHealth='ok', semanticResults populated, weak hint preserved in result.catalogMatches, message contains "low-confidence"
- multiple weak hints (max score 65) still fall through
- weak hints + no client → "skipped", message names the weak-hint situation and points at API_KEY
- weak hints + semantic returning empty → "no_results", message acknowledges both signals

All 247 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Formatter: when semantic results exist alongside weak-only catalog hits, render Documentation FIRST, then "## Lower-Confidence Catalog Hints" with an italicized note that the docs above are likely more authoritative. Prevents the LLM consumer from anchoring on a misleading top hit (the original "Contract already initialized" failure mode rendered the bogus entry under "## Known Errors" with full **bold name** + cause/fix before the actually-relevant docs).
- category filter preserved: when the caller passes `category` and the catalog produced any in-category match (even a weak one), keep the pre-PR short-circuit. Falling through to a category-agnostic semantic search would surface out-of-scope docs and confuse a user who explicitly narrowed the request.
- API_KEY guidance reworded from "An API_KEY would enable..." to "Set API_KEY... (get a free key by running /mcp-key in the Aztec/Noir Discord: https://discord.gg/xMud5StFyA)"; the actionable next-step phrasing matches the rest of the project's wording.
- 3 more tests covering the codex coverage gaps:
  * weak-only + version-mismatch: gate blocks semantic, weak hint preserved, message names both signals.
  * weak-only + allowVersionMismatch=true: gate skipped, semantic runs.
  * category filter + weak-only: short-circuits (does NOT fall through to category-agnostic semantic).

250 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
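The category and version-mismatch rules from this commit can be combined into one decision function, sketched below. The `FallbackInput` shape, the `Decision` labels, and the `decideFallback` name are all hypothetical, and the order in which the real code checks the category filter versus the version gate is not stated in the commit message; checking the category first is an assumption.

```typescript
// Hypothetical input shape; field names other than allowVersionMismatch are assumptions.
interface FallbackInput {
  maxCatalogScore: number | null; // best in-scope catalog score, null when nothing matched
  hasCodeMatch: boolean;          // direct ripgrep hit over cloned source
  category?: string;              // set when the caller narrowed the request
  versionMismatch: boolean;
  allowVersionMismatch: boolean;
}

type Decision = "short-circuit" | "blocked-by-version-gate" | "run-semantic";

const STRONG_MATCH_THRESHOLD = 70;

function decideFallback(input: FallbackInput): Decision {
  const hasCatalogHit = input.maxCatalogScore !== null;
  const hasStrongHit =
    input.maxCatalogScore !== null && input.maxCatalogScore >= STRONG_MATCH_THRESHOLD;

  // Strong catalog hits and direct grep hits are authoritative.
  if (hasStrongHit || input.hasCodeMatch) return "short-circuit";

  // With a category filter, any in-category hit (even a weak one) keeps the
  // old short-circuit, so out-of-scope docs are never surfaced.
  if (input.category !== undefined && hasCatalogHit) return "short-circuit";

  // Assumed ordering: the version gate is consulted only after the checks above.
  if (input.versionMismatch && !input.allowVersionMismatch) {
    return "blocked-by-version-gate";
  }
  return "run-semantic";
}
```

Under these assumptions, a weak-only result with a version mismatch yields "blocked-by-version-gate", and the same input with `allowVersionMismatch: true` yields "run-semantic", mirroring the first two tests added in this commit.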

Codex round-2 finding: the weak-only catalog note `_These are word-overlap fuzzy matches, not direct hits — the documentation results above are likely more authoritative._` rendered even when there were no semantic results above (no client, version mismatch, backend failed, semantic returned empty). Stale copy could mislead a user into thinking they should look up at a documentation section that doesn't exist.

Now the "documentation above" phrasing is gated on `renderSemanticFirst` (which already encodes "semantic returned hits AND we reordered"). The other weak-only paths use neutral copy: `_Treat as low-confidence cues only._`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
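A rough sketch of that gating follows. Only the `renderSemanticFirst` name comes from the commit message, where it appears to be a flag; modeling it as a function over a hypothetical `FormatterState` is an assumption, as are the `HintNote` labels, which stand in for the two copy variants quoted above rather than reproducing the formatter's strings.

```typescript
// Labels for the two note variants; the actual copy lives in the formatter.
// "docs-above" stands for the note that points at the documentation section,
// "neutral" for `_Treat as low-confidence cues only._`.
type HintNote = "docs-above" | "neutral";

// Hypothetical state; field names are assumptions.
interface FormatterState {
  weakOnly: boolean;           // catalog hits exist, but none reach the strong threshold
  semanticResultCount: number; // 0 when semantic was skipped, gated, failed, or empty
}

// Semantic returned hits AND the weak catalog hints were reordered below them.
function renderSemanticFirst(state: FormatterState): boolean {
  return state.weakOnly && state.semanticResultCount > 0;
}

// Only mention "documentation above" when such a section is actually rendered.
function hintNote(state: FormatterState): HintNote {
  return renderSemanticFirst(state) ? "docs-above" : "neutral";
}

console.log(hintNote({ weakOnly: true, semanticResultCount: 0 })); // "neutral"
```

Deriving both the section order and the note from the same predicate keeps the copy from drifting out of sync with what is rendered.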

github-actions · 2026-05-02T01:26:31Z

🎉 This PR is included in version 1.21.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

critesjosh and others added 3 commits May 2, 2026 01:00

critesjosh force-pushed the feat/lookup-error-score-threshold branch from 38ca409 to 23d47f2 Compare May 2, 2026 01:21

critesjosh merged commit 8f00f0c into main May 2, 2026
6 checks passed

critesjosh deleted the feat/lookup-error-score-threshold branch May 2, 2026 01:25

github-actions Bot added the released label May 2, 2026
